Scale Invariant Multi-length Motif Discovery

نویسندگان

  • Yasser F. O. Mohammad
  • Toyoaki Nishida
چکیده

Discovering approximately recurrent motifs (ARMs) in timeseries is an active area of research in data mining. Exact motif discovery was later defined as the problem of efficiently finding the most similar pairs of timeseries subsequences and can be used as a basis for discovering ARMs. The most efficient algorithm for solving this problem is the MK algorithm which was designed to find a single pair of timeseries subsequences with maximum similarity at a known length. Available exact solutions to the problem of finding top K similar subsequence pairs at multiple lengths (which can be the basis of ARM discovery) are not scale invariant. This paper proposes a new algorithm for solving this problem efficiently using scale invariant distance functions and applies it to both real and synthetic dataset.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient Discovery of Variable-length Time Series Motifs with Large Length Range in Million Scale Time Series

Detecting repeated variable-length patterns, also called variable-length motifs, has received a great amount of attention in recent years. Current state-of-the-art algorithm utilizes fixed-length motif discovery algorithm as a subroutine to enumerate variable-length motifs. As a result, it may take hours or days to execute when enumeration range is large. In this work, we introduce an approxima...

متن کامل

Efficient Discovery of Time Series Motifs with Large Length Range in Million Scale Time Series

Detecting repeated variable-length patterns, also called variable-length motifs, has received a great amount of attention in recent years. Current state-of-the-art algorithm utilizes fixed-length motif discovery algorithm as a subroutine to enumerate variable-length motifs. As a result, it may take hours or days to execute when enumeration range is large. In this work, we introduce an approxima...

متن کامل

Toward Unsupervised Activity Discovery Using Multi-Dimensional Motif Detection in Time Series

This paper addresses the problem of activity and event discovery in multi dimensional time series data by proposing a novel method for locating multi dimensional motifs in time series. While recent work has been done in finding single dimensional and multi dimensional motifs in time series, we address motifs in general case, where the elements of multi dimensional motifs have temporal, length, ...

متن کامل

Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences

This work presents a hybrid method for motif discovery in DNA sequences. The proposed method called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses the stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...

متن کامل

G-SteX: Greedy Stem Extension for Free-Length Constrained Motif Discovery

Most available motif discovery algorithms in real-valued time series find approximately recurring patterns of a known length without any prior information about their locations or shapes. In this paper, a new motif discovery algorithm is proposed that has the advantage of requiring no upper limit on the motif length. The proposed algorithm can discover multiple motifs of multiple lengths at onc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014